Analysis of Senator Stock Trading

Patrick Cao

Introduction

Insider trading, or trading financial assets using information that is not open to the public, is punishable by law in the United States. It is a serious violation -- a maximum penalty of 20 years in prison, and up to $20 million in fines. The Securities and Exchange Commission (SEC) is responsible for identifying and conducting investigations regarding insider trading.

With that being said, U.S. senators (and other congresspeople) often have access to information about upcoming policies or laws before the general public, yet are still allowed to trade securities. Senators, however, must disclose their trades publicly. More recently, a few senators have been investigated by the SEC for insider trading using nonpublic knowledge given to them about the COVID-19 pandemic. With the rise in popularity of investing, senator stock trading patterns has become to the public's attention. You may have seen posts on the internet implying that there is rampant insider trading going on in the senate. For example, take this post on Reddit that was upvoted over 30,000 times: https://www.reddit.com/r/dataisbeautiful/comments/gjlvnd/.

Is insider trading in the senate as rampant as this post suggests? This tutorial aims to identify potential insider trading patterns by the U.S. senate. In order to answer this question, we will be using data from https://senatestockwatcher.com/, which itself compiles data from https://efdsearch.senate.gov/. efdsearch.senate.gov does not have an official API, and scraping each page would be extremely tedious. Senate stock watcher aggregates financial disclosures from all senators into one nice dataset.

Let's first start by downloading the dataset. While Senate Stock Watcher provides an API for up to date datasets, we can just download a csv file as well. Download the "All Transactions" csv file from https://senatestockwatcher.com/api.html.

Let's load that into a pandas dataframe.

In [17]:
import pandas as pd
import numpy as np
import datetime

df = pd.read_csv('all_transactions.csv')
df.head()
Out[17]:
transaction_date owner ticker asset_description asset_type type amount comment senator ptr_link
0 11/16/2020 Spouse BA The Boeing Company Stock Purchase $15,001 - $50,000 R Pat Roberts https://efdsearch.senate.gov/search/view/ptr/b...
1 11/16/2020 Spouse V Visa Inc. Stock Purchase $15,001 - $50,000 555 Pat Roberts https://efdsearch.senate.gov/search/view/ptr/b...
2 11/24/2020 Spouse COST Costco Wholesale Corporation Stock Purchase $15,001 - $50,000 555 Pat Roberts https://efdsearch.senate.gov/search/view/ptr/b...
3 11/04/2020 Spouse COST Costco Wholesale Corporation Stock Sale (Full) $15,001 - $50,000 555 Pat Roberts https://efdsearch.senate.gov/search/view/ptr/b...
4 11/09/2020 Spouse BA The Boeing Company Stock Sale (Full) $15,001 - $50,000 sep Pat Roberts https://efdsearch.senate.gov/search/view/ptr/b...

Let's clean this up a little by limiting and purchases and sales only, and dropping unnecessary columns.

In [18]:
start_date = '2016-01-01'
end_date = '2020-11-01'

# Turn transaction_date from string into datetime
df['transaction_date'] = pd.to_datetime(df['transaction_date'])

# Remove (Full) and (Partial) after Sale type
df['type'] = df['type'].apply(lambda x: str(x).split()[0])

# Limit to only purchases and sales
df = df[(df['type'] == 'Purchase') | (df['type'] == 'Sale')]

# limit the dataset to a date range
df = df[(df['transaction_date'] > start_date) & (df['transaction_date'] < end_date)]

df = df.drop(['asset_description', 'comment', 'ptr_link'], axis=1)
df.head()
Out[18]:
transaction_date owner ticker asset_type type amount senator
76 2020-10-13 Joint -- Corporate Bond Sale $1,001 - $15,000 Sheldon Whitehouse
78 2019-04-05 Spouse NVS Stock Sale $1,001 - $15,000 William Cassidy
79 2020-10-16 Child NVDA Stock Purchase $1,001 - $15,000 Ron L Wyden
80 2020-10-16 Child WDAY Stock Purchase $1,001 - $15,000 Ron L Wyden
81 2020-10-16 Child NVDA Stock Purchase $1,001 - $15,000 Ron L Wyden

We got our senator stock trading data, but now we need actual historical market data to visualize these trades. Let's use yfinance, a python Yahoo! Finance library, to get historical market data on SPY, an ETF tracking the S&P500.

In [19]:
import yfinance as yf

spy = yf.Ticker('SPY')
spy_hist = spy.history(period='max')
spy_hist.tail()
Out[19]:
Open High Low Close Volume Dividends Stock Splits
Date
2020-12-07 369.019989 369.619995 367.720001 369.089996 48944300 0.0 0
2020-12-08 367.720001 370.779999 367.670013 370.170013 42458900 0.0 0
2020-12-09 370.880005 371.049988 365.950012 366.850006 74098300 0.0 0
2020-12-10 365.369995 367.859985 364.429993 366.730011 57617300 0.0 0
2020-12-11 364.899994 366.740204 363.260010 366.140015 57698614 0.0 0

Neat! Now we can start putting these two things together.

Exploratory Data Analysis

Senator Data

Let's take an initial look at the data. It would be interesting to see the most traded stocks, the most active senators, or the average value of each trade.

In [20]:
import matplotlib.pyplot as plt

# number of trades by senators
num_trades = df.groupby('senator').count().sort_values(by='transaction_date', ascending=False).head(10)
plt.title('Top 10 Most Active Senators')
plt.xlabel('Total Number of Trades')
plt.ylabel('Senator')
plt.barh(num_trades.index, num_trades['transaction_date'])
plt.show()

# number of trades by ticker
trades_by_ticker = df[df['ticker'] != '--'].groupby('ticker').count().sort_values(by='transaction_date', ascending=False).head(20)
plt.title('Top 20 Most Traded Stocks')
plt.xlabel('Total Number of Trades')
plt.ylabel('Ticker')
plt.barh(trades_by_ticker.index, trades_by_ticker['transaction_date'])
plt.show()

# values of trades
trade_val = df.groupby('amount').count().sort_values(by='transaction_date', ascending=False)
plt.title('Value of Trades')
plt.xlabel('Number of Trades')
plt.ylabel('Value of Trade ($)')
plt.barh(trade_val.index, trade_val['transaction_date'])
plt.show()

It looks like most trades by senators are small -- under $15,000 per trade -- and they like trading AAPL, which is coincidentally the largest market cap company in the world. There's a good mix of tech, pharma, communications, and retail stocks that are the most commonly traded in the senate. There's not too much insight we can pull from this, but it is interesting.

Let's now try to recreate the reddit post. We need to plot overall senator trading activity against the S&P500, or SPY.

It's a little tough to track performance exactly. The size of each trade isn't an exact number -- it's a range. Let's check out what kinds of values we're working with here.

In [21]:
df['amount'].unique()
Out[21]:
array(['$1,001 - $15,000', '$15,001 - $50,000', '$1,000,001 - $5,000,000',
       '$250,001 - $500,000', '$500,001 - $1,000,000',
       '$50,001 - $100,000', '$100,001 - $250,000',
       '$5,000,001 - $25,000,000'], dtype=object)

Yikes. The data doesn't have the exact value of securities traded, the best thing we can probably do is estimate the size of each trade by just taking the middle value of each trade. Unfortunately, the granularity of each bucket is pretty big, but we'll have to try and make do.

In [22]:
# Returns the average of the given bound
def getBounds(row):
    
    amount = row[0]
    trade_type = row[1]

    # This amount shows up once in the entire dataset. Let's just assume the trade is $50m for now. It won't affect our analysis too much.
    if amount == 'Over $50,000,000':
        return 5e7
    
    split = amount.split(' - ')
    lower = int(split[0][1:].replace(',', ''))
    upper = int(split[1][1:].replace(',', ''))
    mid = lower + ((upper - lower) / 2)

    if trade_type == 'Purchase':
        return mid
    return -mid

df['est_amt'] = pd.Series(df[['amount', 'type']].apply(getBounds, axis=1))

df
Out[22]:
transaction_date owner ticker asset_type type amount senator est_amt
76 2020-10-13 Joint -- Corporate Bond Sale $1,001 - $15,000 Sheldon Whitehouse -8000.5
78 2019-04-05 Spouse NVS Stock Sale $1,001 - $15,000 William Cassidy -8000.5
79 2020-10-16 Child NVDA Stock Purchase $1,001 - $15,000 Ron L Wyden 8000.5
80 2020-10-16 Child WDAY Stock Purchase $1,001 - $15,000 Ron L Wyden 8000.5
81 2020-10-16 Child NVDA Stock Purchase $1,001 - $15,000 Ron L Wyden 8000.5
... ... ... ... ... ... ... ... ...
7468 2016-01-06 Self UA Stock Purchase $50,001 - $100,000 John Hoeven 75000.5
7470 2016-11-23 Joint -- Stock Sale $1,001 - $15,000 Patrick J Toomey -8000.5
7471 2016-01-04 Joint -- Stock Sale $1,001 - $15,000 Patrick J Toomey -8000.5
7472 2016-01-05 Joint -- Stock Sale $15,001 - $50,000 Patrick J Toomey -32500.5
7473 2016-01-05 Joint -- Stock Purchase $1,001 - $15,000 Patrick J Toomey 8000.5

6539 rows × 8 columns

Alright, now we got some rough estimates of the value of each transaction, we can group trades together by date and find the aggregate for each day.

In [23]:
aggregate_action_with_type = df.groupby([pd.Grouper(key='transaction_date', freq='D'), 'asset_type', 'type']).agg(est_amt=('est_amt', 'sum'), num_trades=('est_amt', 'count')).reset_index()

# because a sale is the opposite of a purchase, we want to show that
aggregate_action_by_asset = aggregate_action_with_type.copy()
aggregate_action_by_asset['num_purchases'] = aggregate_action_by_asset.apply(lambda row: -row['num_trades'] if row['type'] == 'Sale' else row['num_trades'], axis=1)

aggregate_action_by_asset = aggregate_action_by_asset.groupby(['transaction_date', 'asset_type']).sum().reset_index()
aggregate_action = aggregate_action_by_asset.groupby('transaction_date').sum().reset_index()
print(aggregate_action_by_asset.head())
print(aggregate_action.head())
  transaction_date          asset_type   est_amt  num_trades  num_purchases
0       2016-01-04               Stock   -8000.5           1             -1
1       2016-01-05  Municipal Security   32500.5           1              1
2       2016-01-05               Stock   47006.0          22             12
3       2016-01-06               Stock  139004.5          13              9
4       2016-01-07  Municipal Security   65001.0           2              2
  transaction_date   est_amt  num_trades  num_purchases
0       2016-01-04   -8000.5           1             -1
1       2016-01-05   79506.5          23             13
2       2016-01-06  139004.5          13              9
3       2016-01-07  129005.0          10             10
4       2016-01-08  -32500.5           7             -1

We are now ready to plot senator trades against the market.

In [24]:
spy_hist['Date'] = spy_hist.index

# since we've limited the senator trades to a date range, we want to also limit SPY history
# to the same date range
market = spy_hist[(spy_hist['Date'] > start_date) & (spy_hist['Date'] < end_date)]

def plot_trades_vs_spy(trades, asset_type, metric, y_label):

    # We have a lot of data, so we should make the figure very wide
    plt.figure(figsize=(18,6))

    plt.plot(market['Date'], market['Close'])

    plt.xlabel('Year')
    plt.ylabel('SPY price ($)')
    ax2 = plt.twinx()

    # scale bar graph limits s.t. 0 is centered in graph
    limit = max(abs(trades[metric].min()), trades[metric].max()) * 1.1
    ax2.set_ylim(-limit, limit)
    ax2.spines['bottom'].set_position(('data', 0))

    plt.ylabel('Amount of Securities Traded ($)')

    plt.bar(trades['transaction_date'], trades[metric], width=5, color='orange')
    plt.title('Senator Trades of ' + str(asset_type) +  ' (' + metric + ')' + ' vs. SPY price')

    plt.show()

for asset_type in aggregate_action_by_asset['asset_type'].unique():
    trades_by_asset = aggregate_action_by_asset[aggregate_action_by_asset['asset_type'] == asset_type]
    plot_trades_vs_spy(trades_by_asset, asset_type, 'num_purchases', 'Number of Purchases/Week')
    plot_trades_vs_spy(trades_by_asset, asset_type, 'est_amt', 'Net Securities Traded/Week ($)')

plot_trades_vs_spy(aggregate_action, 'Overall', 'num_purchases', 'Number of Purchases/Week')
plot_trades_vs_spy(aggregate_action, 'Overall', 'est_amt', 'Net Securities Traded/Week ($)')

Is there a relationship between these trades and future SPY price? It would be helpful to try and quantify this and instead plot this data on a scatterplot. Let's plot these trades vs. future SPY price as well. To find the price of SPY a number of days in the future, we need to know all the open market days. Luckily, there is a library for that.

But how long in the future should we check? It's probably not feasible to check a lot of different number of days in the future, so we have to pick this value somewhat arbitrarily. Let's just pick 1 week and 3 weeks. There are 5 trading days per week, so when looking at future trading days, we could look at SPY prices 5 and 15 days in the future.

In [25]:
import pandas_market_calendars as mcal
from datetime import datetime

nyse = mcal.get_calendar('NYSE')
market_open_days = nyse.valid_days(start_date=start_date, end_date=datetime.today().strftime('%Y-%m-%d'))

# returns a function that gets the spy price num_days from today
def get_future_spy_price(num_days):
    def h(today):
        # calculate current spy data
        today_index = market_open_days.get_loc(today, method='nearest')
        today_close = spy_hist.iloc[spy_hist.index.get_loc(today, method='nearest')]['Close']

        # calculate future date and get spy data
        future_index = today_index + num_days
        future_date = market_open_days[future_index]
        future_close = spy_hist.iloc[spy_hist.index.get_loc(future_date, method='nearest')]['Close']
        return 100 * (future_close - today_close) / today_close
    return h

aggregate_action_by_asset['5_day_spy_pct_change'] = aggregate_action_by_asset['transaction_date'].apply(get_future_spy_price(5))
aggregate_action_by_asset['15_day_spy_pct_change'] = aggregate_action_by_asset['transaction_date'].apply(get_future_spy_price(15))
aggregate_action['5_day_spy_pct_change'] = aggregate_action['transaction_date'].apply(get_future_spy_price(5))
aggregate_action['15_day_spy_pct_change'] = aggregate_action['transaction_date'].apply(get_future_spy_price(15))

aggregate_action
Out[25]:
transaction_date est_amt num_trades num_purchases 5_day_spy_pct_change 15_day_spy_pct_change
0 2016-01-04 -8000.5 1 -1 -4.432377 -5.382561
1 2016-01-05 79506.5 23 13 -3.823988 -6.570312
2 2016-01-06 139004.5 13 9 -5.024638 -4.883805
3 2016-01-07 129005.0 10 10 -1.092534 -0.170072
4 2016-01-08 -32500.5 7 -1 -2.141516 0.901415
... ... ... ... ... ... ...
1053 2020-10-08 32500.5 1 1 1.082088 -4.014192
1054 2020-10-13 -24001.5 3 -3 -1.927855 -4.027077
1055 2020-10-16 48003.0 6 6 -0.434798 0.826397
1056 2020-10-27 -3000000.5 1 -1 -0.647508 6.622906
1057 2020-10-30 8000.5 3 1 7.233415 8.816677

1058 rows × 6 columns

Now that we have our data, we can begin plotting. We want to plot value of trades vs. future value and number of purchases vs. future value for both 5 and 15 days in the future.

In [26]:
def set_subplot(ax, x, y, title, x_label, y_label):
    ax.scatter(x, y)
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.grid()

def plot_trades_vs_future_spy_price(trades, asset_type):
    fig, axs = plt.subplots(1, 4, figsize=(24, 5))
    fig.suptitle('Trades vs. Future SPY Price % Change for ' + asset_type)

    set_subplot(axs[0], trades['est_amt'], trades['5_day_spy_pct_change'], '5 day', 'Net Value of Assets Bought/Sold', '5 Day % Change in S&P500')    
    set_subplot(axs[1], trades['num_purchases'], trades['5_day_spy_pct_change'], '5 day', 'Number of Purchases (Sales count as negative)', '5 Day % Change in S&P500')

    set_subplot(axs[2], trades['est_amt'], trades['15_day_spy_pct_change'], '15 day', 'Net Value of Assets Bought/Sold', '15 Day % Change in S&P500')    
    set_subplot(axs[3], trades['num_purchases'], trades['15_day_spy_pct_change'], '15 day', 'Number of Purchases (Sales count as negative)', '15 Day % Change in S&P500')
    plt.show()

for asset_type in aggregate_action_by_asset['asset_type'].unique():
    trades_by_asset = aggregate_action_by_asset[aggregate_action_by_asset['asset_type'] == asset_type]
    plot_trades_vs_future_spy_price(trades_by_asset, asset_type)

plot_trades_vs_future_spy_price(aggregate_action, 'Overall')

There aren't any obvious trends in most of the categories. However, we can still try to uncover trends by building linear regression models and performing hypothesis tests with this data. More on that in the machine learning and hypothesis testing section.

Grouping By Senator

In [27]:
aggregate_action_by_senator = df.groupby([pd.Grouper(key='transaction_date', freq='D'), 'asset_type', 'type', 'senator']).sum()

# Groupby makes transaction_date and type an index, we want to convert those back into a column
aggregate_action_by_senator.reset_index(inplace=True)

# limit the dataset to a date range
aggregate_action_by_senator = aggregate_action_by_senator[(aggregate_action_by_senator['transaction_date'] > start_date) & (aggregate_action_by_senator['transaction_date'] < end_date)]

aggregate_action_by_senator['est_amt'] = aggregate_action_by_senator.apply(lambda row: row['est_amt'] if row['type'] == 'Purchase' else -row['est_amt'], axis=1)

aggregate_action_by_senator['5_day_spy_pct_change'] = aggregate_action_by_senator['transaction_date'].apply(get_future_spy_price(5))
# aggregate_total['15_day_spy_pct_change'] = aggregate_total['transaction_date'].apply(get_future_spy_price(15))


    # plt.figure(figsize=(30,10))
    # plt.scatter(senator_actions['mid_amt'], testx['5_day_spy_pct_change'])
    # plt.xlabel('Estimated Amount of Securities Purchased/Sold')
    # plt.show()

    # plt.figure(figsize=(30,10))
    # plt.scatter(test['mid_amt'], test['15_day_spy_pct_change'])
    # plt.show()
In [28]:
print(aggregate_action_by_senator)

for senator in aggregate_action_by_senator['senator'].unique():

    senator_actions = aggregate_action_by_senator[aggregate_action_by_senator['senator'] == senator]
   
    plt.axvline(0, color='black')
    plt.axhline(0, color='black')
    plt.grid()

    plt.scatter(senator_actions['est_amt'], senator_actions['5_day_spy_pct_change'])
    plt.show()
     transaction_date          asset_type      type             senator  \
0          2016-01-04               Stock      Sale    Patrick J Toomey   
1          2016-01-05  Municipal Security  Purchase     Lamar Alexander   
2          2016-01-05               Stock  Purchase    Patrick J Toomey   
3          2016-01-05               Stock  Purchase  Sheldon Whitehouse   
4          2016-01-05               Stock  Purchase     William Cassidy   
...               ...                 ...       ...                 ...   
2550       2020-10-13               Stock      Sale         Pat Roberts   
2551       2020-10-16               Stock  Purchase         Ron L Wyden   
2552       2020-10-27    Other Securities      Sale       Mark R Warner   
2553       2020-10-30    Other Securities  Purchase     Thomas R Carper   
2554       2020-10-30    Other Securities      Sale     Thomas R Carper   

        est_amt  5_day_spy_pct_change  
0        8000.5             -4.432377  
1       32500.5             -3.823988  
2       24001.5             -3.823988  
3       96006.0             -3.823988  
4       16001.0             -3.823988  
...         ...                   ...  
2550    16001.0             -1.927855  
2551    48003.0             -0.434798  
2552  3000000.5             -0.647508  
2553    16001.0              7.233415  
2554     8000.5              7.233415  

[2555 rows x 6 columns]

Evaluating Every Single Trade

In [29]:
hists = {}

# returns a function that gets the spy price num_days from today
def get_future_stock_price(ticker, today, num_days):

    # print('Finding %change in price after', num_days, 'days for', ticker, 'on', today)

    try:
        stock_history = yf.Ticker(ticker).history(period='5y')

        # calculate current stock data
        today_index = market_open_days.get_loc(today, method='nearest')
        today_close = stock_history.iloc[stock_history.index.get_loc(today, method='nearest')]['Close']

        # calculate future date and get stock data
        future_index = today_index + num_days
        future_date = market_open_days[future_index]
        future_close = stock_history.iloc[stock_history.index.get_loc(future_date, method='nearest')]['Close']
        return (future_close - today_close) / today_close
    except Exception as e:
        return np.nan
In [30]:
stock_trades = df[df['ticker'] != '--']

# we're processing an insane amount of data here, so let's save this dataframe back to the disk
# so we can read it back easily
try:
    stock_trades = pd.read_csv('stock_trading_performance.csv')
except FileNotFoundError:
    print('calculating...')
    stock_trades['5_day_pct_change'] = stock_trades.apply(lambda row: get_future_stock_price(row['ticker'], row['transaction_date'], 5), axis=1)
    print('done with 5 day')
    stock_trades['15_day_pct_change'] = stock_trades.apply(lambda row: get_future_stock_price(row['ticker'], row['transaction_date'], 15), axis=1)
    print('done with 15 day')
    stock_trades['30_day_pct_change'] = stock_trades.apply(lambda row: get_future_stock_price(row['ticker'], row['transaction_date'], 30), axis=1)
    print('done with 15 day')
    # stock_trades.apply(lambda row: get_future_stock_price(row['ticker'], row['transaction_date'], 5), axis=1)
    stock_trades.to_csv('stock_trading_performance.csv')
In [31]:
print(stock_trades['owner'].unique())
print(len(stock_trades['senator'].unique()))
['Spouse' 'Child' 'Joint' 'Self']
39
In [32]:
stock_trades['5_day_pct_change'] = pd.to_numeric(stock_trades['5_day_pct_change'], errors='coerce')
stock_trades['15_day_pct_change'] = pd.to_numeric(stock_trades['15_day_pct_change'], errors='coerce')

for senator in stock_trades['senator'].unique():
    for owner in stock_trades['owner'].unique():
        print(senator, owner)

        stock_trades_by_senator = stock_trades[(stock_trades['senator'] == senator) & (stock_trades['owner'] == owner)]

        plt.scatter(stock_trades_by_senator['est_amt'], stock_trades_by_senator['5_day_pct_change'])
        plt.show()
# plt.scatter(stock_trades['mid_amt'], stock_trades['5_day_pct_change'])
# plt.show()
# plt.scatter(stock_trades['mid_amt'], stock_trades['15_day_pct_change'])
William Cassidy Spouse
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'est_amt'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-32-546d18c53689> in <module>
      8         stock_trades_by_senator = stock_trades[(stock_trades['senator'] == senator) & (stock_trades['owner'] == owner)]
      9 
---> 10         plt.scatter(stock_trades_by_senator['est_amt'], stock_trades_by_senator['5_day_pct_change'])
     11         plt.show()
     12 # plt.scatter(stock_trades['mid_amt'], stock_trades['5_day_pct_change'])

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'est_amt'
In [16]:
for senator in stock_trades['senator'].unique():
    stock_trades_by_senator = stock_trades[stock_trades['senator'] == senator]

    fig, axs = plt.subplots(1, 2, figsize=(12, 5))
    fig.suptitle('Trade Value vs. Future Stock Price % Change for Senator ' + senator)

    axs[0].scatter(stock_trades_by_senator['est_amt'], stock_trades_by_senator['5_day_pct_change'])
    axs[0].set_title('5 Day')
    axs[0].set_xlabel('Net Value of Assets Bought/Sold')
    axs[0].set_ylabel('5 Day % Change in Stock Price')
    axs[1].scatter(stock_trades_by_senator['est_amt'], stock_trades_by_senator['15_day_pct_change'])
    axs[1].set_title('15 Day')
    axs[1].set_xlabel('Net Value of Assets Bought/Sold')
    axs[1].set_ylabel('15 Day % Change in Stock Price')

    axs[0].grid()
    axs[1].grid()

    # plt.
    # plt.title('Trade vs. 5 Day Future Stock Price % Change for Senator ' + senator)
    plt.show()
In [ ]: